ABSTRACT

Future weak lensing surveys potentially hold the highest statistical power for
constraining cosmological parameters compared to other cosmological probes. The
statistical power of a weak lensing survey is determined by the sky coverage, the
inverse of the noise in shear measurements, and the galaxy number density. The
combination of the latter two factors is often expressed in terms of $n_{\rm eff}$,
the "effective number density of galaxies used for weak lensing measurements". In
this work, we estimate $n_{\rm eff}$ for the Large Synoptic Survey Telescope (LSST)
project, the most powerful ground-based lensing survey planned for the next two
decades. We investigate with detailed simulations how the following factors affect
the resulting $n_{\rm eff}$ of the survey: (1) survey time, (2) shear measurement
algorithm, (3) algorithm for combining multiple exposures, (4) inclusion of data
from multiple filter bands, (5) redshift distribution of the galaxies, and (6)
masking and blending. For the first time, we quantify in a general weak lensing
analysis pipeline the sensitivity of $n_{\rm eff}$ to the above factors.

We find that with current weak lensing algorithms, expected distributions of
observing parameters, and all lensing data (r- and i-band, covering 18,000 degree$^2$
of sky) for LSST, $n_{\rm eff} \approx 37$ arcmin$^{-2}$ before considering blending
and masking, $n_{\rm eff} \approx 31$ arcmin$^{-2}$ when rejecting seriously blended
galaxies and $n_{\rm eff} \approx 26$ arcmin$^{-2}$ when considering an additional 15%
loss of galaxies due to masking. With future improvements in weak lensing algorithms,
these values could be expected to increase by up to 20%. Throughout the paper, we
also stress the ways in which $n_{\rm eff}$ depends on our ability to understand and
control systematic effects in the measurements.

1 INTRODUCTION

Weak lensing is one of the most powerful tools for probing the dark matter
distribution in our Universe and constraining dark energy parameters (Weinberg et
al. 2012). Gravitational fields due to the large-scale matter distribution perturb
light rays traveling from distant galaxies, causing the observed galaxy shapes to be
slightly distorted compared to their true shapes. Since the original shapes are
unknown, these weak distortions cannot be discovered by observations of individual
galaxies. They can only be inferred via statistical approaches, e.g., correlations of
galaxy shape parameters as a function of angular separation (see, e.g., Bartelmann &
Schneider 2001).

The ultimate statistical power for deriving cosmological parameters with weak
lensing depends on the total sky coverage and the number density of galaxies with
accurate shear measurements in a survey. One reduces the statistical uncertainties
from weak lensing by surveying wider and deeper fields. This has been the driver
behind all ongoing and future weak lensing surveys [e.g., the Kilo-Degree Survey
(KiDS), the Hyper Suprime-Cam Survey (HSC), the Dark Energy Survey (DES), the Large
Synoptic Survey Telescope (LSST), the Euclid mission and the Wide-Field Infrared
Survey Telescope (WFIRST)].

Using the weak lensing Fisher-matrix calculation introduced by Albrecht et al.
(2006, hereafter A06), the statistical uncertainty of a weak lensing survey is
determined by the combined quantity $f_{\rm sky}^{-0.5}\sigma_\gamma^2$, where
$f_{\rm sky}$ is the fraction of sky covered by the survey and $\sigma_\gamma$ is the
uncertainty on the mean shear in unit area. The sky coverage $f_{\rm sky}$ is limited
by the accessible sky from a ground-based telescope. $\sigma_\gamma$, however, cannot
be straightforwardly calculated, since it depends not only on the designed survey
depth (the number of galaxies we see given the design of the survey), but also on the
performance of the entire analysis pipeline one uses to measure these distortions
(the amount of information we can actually extract from each galaxy given the
measurement methods).

In A06, the question of calculating $\sigma_\gamma$ is rephrased in terms of
calculating $n_{\rm eff}$, the effective number density of galaxies used for weak
lensing measurements. A06 defined $n_{\rm eff}$ to be the number density of perfectly
measured galaxies that would contribute the same amount of shear noise as the
(imperfectly) measured ensemble of galaxies. The standard deviation in each component
of the ellipticity for the perfectly measured galaxy population (commonly known as
"shape noise"), $\sigma_{\rm SN}$, is assumed to be fixed at 0.25 in A06. The term
"noise" refers to the fact that the intrinsic galaxy shapes introduce uncertainty in
the shear inferred from these galaxies. Most studies to date have adopted the
formulae and model parameters in A06 to estimate the performance of future weak
lensing surveys. However, the ways in which $n_{\rm eff}$ is quoted in the literature
are often inconsistent, causing confusion in the field. The main goal of this paper
is to clearly define $n_{\rm eff}$, estimate $n_{\rm eff}$ for LSST, and
quantitatively evaluate the sensitivity of $n_{\rm eff}$ to different assumptions
about the survey plan and analysis pipeline. Our methodology for calculating
$n_{\rm eff}$ is general and can be applied to other surveys given the basic survey
and analysis information.

In Section 2 we first briefly review the weak lensing notation and the definition
of $n_{\rm eff}$. We also discuss the relevant information about the dataset and the
analysis pipeline required to calculate $n_{\rm eff}$. We introduce the input galaxy
catalog and the $\sigma_{\rm SN}$ value for this work in Section 3. Then we show step
by step in Section 4 how the shear measurement noise is estimated depending on
different analysis methods and galaxy selection. We calculate in Section 5 the
$n_{\rm eff}$ values for LSST under different scenarios and discuss in depth the
different factors that affect $n_{\rm eff}$. Finally, we estimate in Section 6 how
$n_{\rm eff}$ degrades when practical effects such as masking and blending are
introduced. The same calculation is then applied to an existing survey in Section 7
to demonstrate the generality of our approach. We summarize our results in Section 8.

http://www.darkenergysurvey.org/

2 OVERVIEW OF THE PROBLEM
2.1 Weak lensing notation 

Throughout the paper, we measure object shapes using the 2-component ellipticity
spinor:

$$ e = e_1 + i e_2 , \tag{1} $$

where

$$ e_1 = \frac{I_{11} - I_{22}}{I_{11} + I_{22} + 2\sqrt{I_{11} I_{22} - I_{12}^2}} , \qquad
   e_2 = \frac{2 I_{12}}{I_{11} + I_{22} + 2\sqrt{I_{11} I_{22} - I_{12}^2}} . \tag{2} $$

The $I_{ij}$ are the normalized moments of the object's light intensity profile
$I(x_1, x_2)$:

$$ I_{ij} = \frac{\int\!\!\int dx_1\, dx_2\, I(x_1, x_2)\, x_i x_j}{\int\!\!\int dx_1\, dx_2\, I(x_1, x_2)} . \tag{3} $$

Under this definition, the measured ellipticity $e$ changes in the presence of shear
$\gamma = \gamma_1 + i\gamma_2$ and convergence $\kappa$ as (see, e.g., Bartelmann &
Schneider 2001):

$$ e = \begin{cases}
   (e^s + g)(1 + g^* e^s)^{-1} , & |g| \le 1 \\
   (1 + g e^{s*})(e^{s*} + g^*)^{-1} , & |g| > 1
   \end{cases} \tag{4} $$

where $e^s = e^s_1 + i e^s_2$ refers to the intrinsic ellipticity of the galaxy before
shearing, the asterisk denotes the complex conjugate, and $g \equiv g_1 + i g_2$ is
the "reduced shear" defined by

$$ g = \frac{\gamma}{1 - \kappa} . \tag{5} $$

Here we have adopted the approximation that $\gamma \approx g$ in the limit of weak
lensing where $\kappa \ll 1$.
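The notation above can be exercised with a few lines of Python. The sketch below is only illustrative: it assumes the ellipticity convention whose shear response is Equation 4, and the function names (`ellipticity_from_moments`, `reduced_shear`, `apply_reduced_shear`) are hypothetical helpers introduced here, not part of any pipeline used in this paper.

```python
def ellipticity_from_moments(i11, i22, i12):
    """Complex ellipticity e = e1 + i*e2 from normalized second moments.

    Assumed convention (the one that transforms as in Equation 4):
    e = (I11 - I22 + 2i*I12) / (I11 + I22 + 2*sqrt(I11*I22 - I12^2)).
    """
    denom = i11 + i22 + 2.0 * (i11 * i22 - i12 ** 2) ** 0.5
    return complex(i11 - i22, 2.0 * i12) / denom


def reduced_shear(gamma, kappa):
    """Reduced shear g = gamma / (1 - kappa); gamma may be complex."""
    return gamma / (1.0 - kappa)


def apply_reduced_shear(e_s, g):
    """Observed ellipticity for intrinsic ellipticity e_s and reduced shear g."""
    e_s = complex(e_s)
    g = complex(g)
    if abs(g) <= 1.0:
        return (e_s + g) / (1.0 + g.conjugate() * e_s)
    return (1.0 + g * e_s.conjugate()) / (e_s.conjugate() + g.conjugate())
```

For an intrinsically round object ($e^s = 0$) the observed ellipticity equals $g$, which is why averaging many galaxy ellipticities yields an estimate of the shear.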

In the presence of noise, a weighting function $W(x_1, x_2)$ is included in the
integrands in Equation 3 to reduce the fluctuations in ellipticity measurements. The
width of $W(x_1, x_2)$ is approximately the size of the observed object, which yields
the maximum signal-to-noise ratio for each individual object. Due to imperfect
point-spread-function (PSF) models and this weighting function $W(x_1, x_2)$,
Equation 4 is no longer exact. As a result, a calibration factor (often referred to
as the "shear responsivity") is introduced to correct for this effect (Heymans et al.
2006; Massey et al. 2007; Bridle et al. 2010).



2.2 Relation between shear noise and $n_{\rm eff}$

As mentioned earlier, the relevant quantity in measuring the statistical power of a
survey is the uncertainty on the mean shear per unit area, or $\sigma_\gamma$ (given
fixed $f_{\rm sky}$). For each galaxy, since the shear noise results from the
intrinsic shape noise as well as measurement noise, we can write

$$ \sigma_{\gamma,i}^2 = \sigma_{\rm SN}^2 + \sigma_{m,i}^2 , \tag{6} $$

where we have assumed that shape noise is uncorrelated with measurement noise. The
subscript $i$ indicates this is the shear noise for the $i$th galaxy and the
subscript $m$ refers to the measurement noise; $\sigma$ indicates the
root-mean-square (RMS) of the distribution. Note that measurement noise depends on
the galaxy's shape, size and brightness, while shape noise is usually taken to be
constant for the entire galaxy sample (however, see the discussion in Section 3).

http://wfirst.gsfc.nasa.gov/

If we assume the mean shear estimate $\hat{\gamma}$ is calculated by the weighted
mean of the shear over the entire sample, where the weight is just the inverse
variance in each measurement, then we have:

$$ \hat{\gamma} = \frac{\sum_{i=1}^{N} \gamma_i / \sigma_{\gamma,i}^2}{\sum_{i=1}^{N} 1 / \sigma_{\gamma,i}^2} . \tag{7} $$

$\sigma_\gamma^2$ is equal to the survey area times the variance in $\hat{\gamma}$,
which is derived from the shear noise in individual galaxies, $\sigma_{\gamma,i}$:

$$ \sigma_\gamma^2 = \Omega\, {\rm Var}(\hat{\gamma})
   = \Omega \left( \sum_{i=1}^{N} \frac{1}{\sigma_{\gamma,i}^2} \right)^{-1}
   \equiv \Omega\, \frac{\sigma_{\rm SN}^2}{N_{\rm eff}}
   = \frac{\sigma_{\rm SN}^2}{n_{\rm eff}} , \tag{8} $$

where $N$ is the total number of galaxies used in the lensing analysis and $\Omega$
is the total sky coverage. The last two expressions in Equation 8 provide the
operational definition of $n_{\rm eff}$, where we have defined $N_{\rm eff}$ to be
the effective number of weak lensing galaxies corresponding to this galaxy sample and
$n_{\rm eff} = N_{\rm eff}/\Omega$. Rearranging the terms in Equation 8 and using
Equation 6 leads to the following relation:

$$ n_{\rm eff} = \frac{1}{\Omega} \sum_{i=1}^{N} \frac{\sigma_{\rm SN}^2}{\sigma_{\gamma,i}^2}
   = \frac{1}{\Omega} \sum_{i=1}^{N} \frac{\sigma_{\rm SN}^2}{\sigma_{\rm SN}^2 + \sigma_{m,i}^2} . \tag{9} $$

Recall the measured weak lensing power spectrum can be written as (Amara & Refregier
2008)

$$ C^{\kappa}_{ij}(\ell) = P^{\kappa}_{ij}(\ell) + \delta_{ij}\, \sigma_\gamma^2 + C^{\kappa,{\rm sys}}_{ij}(\ell) , \tag{10} $$

where $i$ and $j$ denote two redshift bins, $C^{\kappa}_{ij}(\ell)$ is the measured
lensing power spectrum, $P^{\kappa}_{ij}(\ell)$ is the true lensing power spectrum,
$\delta_{ij}$ is the Kronecker delta function and $C^{\kappa,{\rm sys}}_{ij}(\ell)$
is the systematic error in the shear power spectrum measurement. The uncertainty in
the measured weak lensing power spectrum can be written as

$$ \Delta P^{\kappa}_{ij}(\ell) = \sqrt{\frac{2}{(2\ell + 1) f_{\rm sky}}}
   \left[ P^{\kappa}_{ij}(\ell) + \sigma_\gamma^2 \right] . \tag{11} $$

This clearly displays what was mentioned earlier: the statistical uncertainty in
cosmic shear measurements (the second term in the square brackets) is determined by
the factor $f_{\rm sky}^{-0.5}\sigma_\gamma^2$, or equivalently,
$f_{\rm sky}^{-0.5}\sigma_{\rm SN}^2 n_{\rm eff}^{-1}$.

From Equation 9 we observe that calculating $n_{\rm eff}$ for LSST involves a
combination of considerations. First, we need an understanding of the intrinsic
distribution of galaxies in the multi-dimensional space (e.g., size, magnitude,
redshift, shape, etc.), given the depth of the LSST dataset. Second, we need to
understand the expected shear measurement error for each galaxy, which depends on the
characteristics of the galaxy, the measurement algorithm, how multiple measurements
of the same galaxy are combined, and considerations of systematic errors in the
measurement. We address the first part of the problem (the intrinsic galaxy
distribution) in Section 3 and the second part (the shear measurement error) in
Section 4. It is important to realize that even for the same dataset, it is possible
to get different $n_{\rm eff}$ values depending on the choices one makes in the
analysis pipeline. Thus one needs to be careful when quoting or comparing these
numbers, and give enough information on the assumptions involved.

We also note that in Equation 11 we have intentionally separated the systematic
errors from the statistical errors in the $n_{\rm eff}$ calculation for simplicity.
However, as we discuss in Section 4, $n_{\rm eff}$ depends on several factors in the
analysis pipeline that are set by requirements on systematic errors in shear
measurements. As a result, $n_{\rm eff}$ can be coupled with the systematic errors in
an indirect way. The exact tradeoff between systematic and statistical errors for
cosmic shear measurements, and the effect on $n_{\rm eff}$, is algorithm-dependent
and beyond the scope of this paper.
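To make the operational definition of $n_{\rm eff}$ in Equation 9 concrete, a minimal Python sketch is given here. It assumes a list of per-galaxy measurement noise values $\sigma_{m,i}$ and a survey area in arcmin$^2$; the function names and the default $\sigma_{\rm SN} = 0.26$ (the value adopted later in this work) are illustrative choices, not code from the analysis itself.

```python
def shear_variance(sigma_m, sigma_sn=0.26):
    """Variance of the inverse-variance weighted mean shear (Equations 6-8).

    sigma_m is an iterable of per-galaxy measurement noise values.
    """
    return 1.0 / sum(1.0 / (sigma_sn ** 2 + s ** 2) for s in sigma_m)


def n_eff(sigma_m, area_arcmin2, sigma_sn=0.26):
    """Effective number density in arcmin^-2 (Equation 9)."""
    weights = (sigma_sn ** 2 / (sigma_sn ** 2 + s ** 2) for s in sigma_m)
    return sum(weights) / area_arcmin2
```

A perfectly measured galaxy ($\sigma_m = 0$) counts as one full galaxy, a galaxy with $\sigma_m = \sigma_{\rm SN}$ counts as half, and a galaxy with $\sigma_m = 2\sigma_{\rm SN}$ counts as one fifth, so three such galaxies in 1 arcmin$^2$ give $n_{\rm eff} = 1.7$ arcmin$^{-2}$.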



3 SHAPE NOISE AND THE INTRINSIC 
GALAXY DISTRIBUTION 

To start, we need a realistic galaxy catalog that contains the primary
characteristics (redshift, size, magnitude and shape) of the galaxies expected to be
seen in a 10-year LSST weak lensing dataset. The LSST weak lensing survey is expected
to image 18,000 square degrees (Ivezic & the LSST Science Council 2011) of the sky in
six filter bands (ugrizy) to a median redshift of $\sim 1.2$ and depth of
$r \sim 27.5$ and $i \sim 26.8$ (Ivezic et al. 2008). For this study, we use a
typical simulated galaxy catalogue generated by the LSST Catalog Simulator (Connolly
et al. in preparation, CatSim).

We briefly describe here the key steps and references for creating the galaxy
catalog to help readers understand the results of this work. CatSim generates galaxy
catalogs with realistic galaxy morphologies (Sersic bulge+disk galaxies, where the
bulge and disk components have Sersic index 4 and 1, respectively), apparent colors
and spatial distributions, and redshifts extending up to 5 over an area of 4.5 x 4.5
square degrees. The simulations include galaxies with r-band AB magnitudes brighter
than 28. In Connolly et al. (in preparation), the galaxy number density as a function
of magnitude and redshift in the CatSim catalog is shown to be well matched to
observations in the DEEP2 Redshift Survey (Davis et al. 2003; Coil et al. 2004).
Finally, we assign shapes, or apparent ellipticities, to each galaxy according to
those measured in the COSMOS dataset (Leauthaud et al. 2007, private communication).
In Figure 1 we show the redshift, apparent r- and i-band (the two main lensing bands)
AB magnitude, size and ellipticity ($e_1$, $e_2$) distributions of the galaxy
population used in this study. The magnitude and size of each galaxy will be used to
calculate the signal-to-noise ratio (Equation 14) and effective size (Equation 15) of
each measurement (see Appendix A). These two quantities determine the measurement
noise, $\sigma_m$, for each galaxy (see Section 4). The RMS width of Figure 1(d)
gives $\sigma_{\rm SN} \approx 0.26$.

Although we assume shape noise to be independent of galaxy morphology, redshift,
size and magnitude in our calculations, this is not strictly true. Hwang & Park
(2009), for example, showed that the ratio of early-type (elliptical) to late-type
(spiral and irregular) galaxies at low redshift is higher compared to that at high
redshift. This could in principle introduce a redshift-dependent shape noise.
However, as shown in Leauthaud et al. (2007) and Joachimi et al. (2013), the
estimated shape noise as a function of galaxy redshift, size and magnitude is
consistent with being flat within measurement noise in the COSMOS data. As a result,
we choose to make the first-order approximation in this work that shape noise is
constant as measured in Leauthaud et al. (2007). A non-constant $\sigma_{\rm SN}$
correction to this approximation will require further investigation with deep
space-based data (e.g., Jee & Tyson 2011) and better shear measurement algorithms.

This magnitude limit is defined as the r-band AB magnitude at 5$\sigma$ for a point
source.

http://deep.ps.uci.edu/

Since we are extracting the distribution from the COSMOS measurements directly, the
ellipticities we assign to the galaxies include the galaxy intrinsic shape and cosmic
shear. However, we note that the level of cosmic shear is over an order of magnitude
smaller than the level of the galaxies' intrinsic shape noise; i.e., Figure 1(d) is
dominated by shape noise.



4 THE SHEAR MEASUREMENT NOISE 

In this section we estimate the shear measurement noise, $\sigma_m$, for a large
range of galaxies under realistic observing conditions using high-fidelity
simulations. We then select the galaxies used for weak lensing measurements.



4.1 Simulation and shear measurement 



We invoke the LSST Photon Simulator v3.0 (Peterson et al. in preparation, PhoSim) to
generate high-fidelity images with realistic noise and instrumental/atmospheric
effects. We also use the LSST Operations Simulator (Delgado et al. 2006, OpSim)
catalog v3.61 to derive the expected distribution of observing conditions. PhoSim is
a fast Monte Carlo photon ray-tracing code that simulates all the major physical
effects from the atmosphere down to the CCD readout. It adopts an atmospheric model
with multi-layer Kolmogorov turbulent screens distributed from a few meters to a few
tens of kilometers above the telescope. Each screen is described by several
parameters associated with the wind speed/direction and turbulence strength. The
optics model in PhoSim is based on the most up-to-date engineering design, with
optical errors (e.g., optics element misalignment, mirror surface perturbation,
tracking errors, etc.) at the level set by engineering specifications. OpSim, on the
other hand, models the telescope configuration, slewing mechanism, weather and sky
conditions at the LSST site in a 10-year period. Figure 2 shows the atmospheric
seeing and sky background distribution in r- and i-band from the OpSim catalog used
in this paper. Due to the wavelength-dependent nature of the system throughput and
PSF size, the best possible image quality (maximum signal-to-noise ratio and minimum
PSF size) is usually achieved in the r- and i-band images. For optimal results in
weak lensing measurements, the OpSim algorithm would thus preferentially choose to
image in one of these "lensing bands" when the observing conditions are good.

With the input galaxy catalog from CatSim and the observing parameters from OpSim,
we use PhoSim to generate a set of 1,000 simulated 15-s r-band LSST exposures on a
single LSST CCD sensor. For the 1,000 exposures, we simulate galaxies with the
redshift, size, magnitude and shape distribution in Figure 1 and randomly assign
observing conditions based on the distributions in Figure 2. We select 10 random
locations on the LSST focal plane for these simulations to capture realistic PSF
effects across the field of view. For each exposure, we make two images: the first
image contains galaxies from CatSim, while the second image is identical to the
first, except that we replace each of the galaxies with a bright star that
effectively samples the PSF of its galaxy counterpart in the first image.

http://dev.lsstcorp.org/trac/wiki/IS_phosim

http://ssg.astro.washington.edu/elsst/opsim.shtml

On top of the atmospheric seeing, we add the instrumental PSF (0.4 arcsec) in
quadrature to get the total PSF size used in later calculations.

For each simulated image, we detect objects using the Source Extractor software
(Bertin & Arnouts 1996) and measure shear using the imcat software package. The shear
measurement algorithm in imcat performs the following steps: First, the shapes of
both stars and galaxies are measured. Then, for each galaxy in the first image, the
PSF effects are corrected using the corresponding bright star in the second image.
Finally, a shear value is calculated for each galaxy. The difference between the
measured shear and the input shear is the shear measurement error, or

$$ \delta\gamma = \gamma_{\rm measured} - \gamma_{\rm input} . \tag{12} $$



We use imcat for the shape/shear measurement because it is one of the most commonly
used methods for weak lensing studies. The results from this paper will thus be
directly relevant to the ongoing and future surveys. But given recent improvements in
the shear measurement methods (Bridle et al. 2009; Kitching et al. 2012a,b), our
results are conservative in this regard.

Note that in this analysis framework we have ignored several effects from a
realistic stellar population. First, we choose to estimate $n_{\rm eff}$ for a
typical weak lensing field, with Galactic latitude of $|b| \sim 60$ degrees. The
stellar density in these fields is relatively low ($\sim 1$ arcmin$^{-2}$), such that
significant blending from stars can be ignored. Second, by simulating the "true" PSF
for each galaxy via simulations (in the second image), we have avoided the process of
PSF modeling. This includes selecting well-measured stars and modeling the PSF by
interpolating shape parameters of these stars. The effect of imperfect star selection
is expected to be small, since the bright stars used for PSF modeling are generally
easy to identify without much contamination from the galaxies. The effect of
interpolating shape parameters from stars depends on the interpolation scheme. In
Appendix B we show that using a conventional interpolation method for PSF modeling
yields only a small (3-6%) degradation in $n_{\rm eff}$ compared to the case for
which the PSF is known exactly.

In the nominal survey plan for LSST, the telescope will take two 15-s exposures
separated by a 2-3-s readout with all instrument configurations held fixed. This full
sequence is called a "visit", which is effectively a continuous 30-s exposure. To
model this, we estimate $\sigma_m$ in single 15-s exposures from the simulations and
assign identical observing parameters for the two exposures in the same visit.

We sample galaxy parameters from the 3D redshift-size-magnitude distribution, and
assign a random ellipticity from Figure 1(d).

http://www.ifa.hawaii.edu/~kaiser/imcat/ The shear measurement algorithm in imcat is
based on the KSB algorithm developed by Kaiser, Squires & Broadhurst (1995), Luppino
& Kaiser (1997) and Hoekstra et al. (1998).

Figure 1. Distributions of (a) redshift, (b) r- and i-band magnitude, (c) size, and
(d) single-component ellipticity in the galaxy catalog used for this study. All
histograms are normalized to unit area. The sizes of the galaxies are measured by the
second-moment radius, which can be derived from the half-light radius of the bulge
and disk component and the bulge-to-disk flux ratio (Appendix A1). In (e) we show how
the half-light radius of the bulge and disk component and the bulge-to-disk flux
ratio are distributed in the galaxy catalog. Notice that there are bulge-only and
disk-only galaxies, which lie on the two axes in the plot.

Figure 2. Distributions of (a) r- and i-band atmospheric seeing and (b) sky
background for the expected 10-year LSST survey as simulated from OpSim. All
histograms are normalized to unit area.

Figure 3. Distribution of the (a) signal-to-noise ratio ($\nu$) and (b) effective
size ($R = r_{\rm gal}/r_{\rm PSF}$) as measured from the simulations. The median
values of these distributions are listed on the plots and all histograms are
normalized to unit area.

Our approach of estimating shear measurement noise from PhoSim is very similar to
that used in Bard et al. (2013). The major differences between the simulations
described here and those in Bard et al. (2013) are (1) we simulate single-exposure
depths instead of the expected full survey depth, and (2) we do not include shape
noise in our definition of measurement errors.



4.2 Single exposure shear measurement noise for 
a single galaxy 

To characterize the shear measurement noise $\sigma_m$, we follow the procedure
suggested by Bernstein & Jarvis (2002), and assume that $\sigma_m$ is mainly
determined by the signal-to-noise ratio ($\nu$) and the effective size ($R$) of the
measured galaxy via the following form:

$$ \sigma_m(\nu, R) = \frac{a}{\nu} \left[ 1 + \left( \frac{b}{R} \right)^{c} \right] , \tag{13} $$

where $(a, b, c)$ are coefficients depending on the shear algorithm and the image
quality.

Here, the galaxy's signal-to-noise ratio ($\nu$) is defined as

$$ \nu = \frac{S}{\sqrt{S + B}} , \tag{14} $$

where $S$ is the source photon counts and $B$ is the background photon counts. The
galaxy's effective size ($R$) is defined to be the relative size of the galaxy to the
PSF, or

$$ R = \frac{r_{\rm gal}}{r_{\rm PSF}} . \tag{15} $$

Here, $r_{\rm gal}$ and $r_{\rm PSF}$ are the second-moment radii (see Equation A5)
of the galaxy's and the PSF's light distributions. In Appendix A we show how $\nu$
and $R$ can be calculated from observational quantities as well as the generic galaxy
model described in Section 3 and observational parameters.

In practice, both of these are calculated in an aperture; we use the convention
adopted by Source Extractor, where they define

The main task of this section is to determine the coefficients in Equation 13 from
the shear measurements in Section 4.1. To do this, we first calculate the $\nu$ and
$R$ values for all galaxies measured in Section 4.1 and bin them into $(\nu, R)$
bins. Then, for all galaxies in each $(\nu, R)$ bin, we calculate the RMS of the
shear measurement errors (Equation 12). This provides an estimate of $\sigma_m$ for
that bin. We then fit the $\sigma_m(\nu, R)$ surface and determine the $(a, b, c)$
coefficients.

We use 30 logarithmic bins in $\nu$ ranging from 3 to 650 and 30 logarithmic bins in
$R$ ranging from 0.2 to 43, which cover all the galaxies measured in these
simulations. Since each bin contains a different number of galaxies, an uncertainty
estimate of $\sigma_m/\sqrt{N_{\rm bin}}$ is used for each bin during the fit, where
$N_{\rm bin}$ is the number of galaxies in that bin. The distributions of $\nu$ (for
any $R$) and of $R$ (for any $\nu$) measured from single-exposure simulations are
shown in Figure 3, and a typical histogram for one of the bins is shown in Figure 4.
As can be seen, the shear measurement error is slightly non-Gaussian with some
low-level wings.
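The binning step described above can be sketched as follows. This is a simplified illustration under stated assumptions: each sample is a tuple $(\nu, R, \delta\gamma)$ with $\delta\gamma$ treated as a single real shear component, and the helper names (`log_bin_edges`, `bin_index`, `binned_rms`) are hypothetical. The surface fit itself is not shown.

```python
import math


def log_bin_edges(lo, hi, n_bins):
    """Edges of n_bins logarithmically spaced bins between lo and hi."""
    step = (math.log(hi) - math.log(lo)) / n_bins
    return [math.exp(math.log(lo) + k * step) for k in range(n_bins + 1)]


def bin_index(value, lo, hi, n_bins):
    """Index of the logarithmic bin containing value, or None if out of range."""
    if not (lo <= value < hi):
        return None
    frac = (math.log(value) - math.log(lo)) / (math.log(hi) - math.log(lo))
    return min(int(frac * n_bins), n_bins - 1)


def binned_rms(samples, nu_range=(3.0, 650.0), r_range=(0.2, 43.0), n_bins=30):
    """RMS of the shear error per (nu, R) bin.

    samples: iterable of (nu, R, delta_gamma).
    Returns {(i, j): (rms, n_gal, rms / sqrt(n_gal))}; the last entry is the
    per-bin uncertainty used to weight the fit.
    """
    sums = {}
    for nu, r, dg in samples:
        i = bin_index(nu, nu_range[0], nu_range[1], n_bins)
        j = bin_index(r, r_range[0], r_range[1], n_bins)
        if i is None or j is None:
            continue  # galaxy falls outside the binned range
        total, count = sums.get((i, j), (0.0, 0))
        sums[(i, j)] = (total + dg * dg, count + 1)
    result = {}
    for key, (total, count) in sums.items():
        rms = math.sqrt(total / count)
        result[key] = (rms, count, rms / math.sqrt(count))
    return result
```

Using the RMS rather than a Gaussian width means that the low-level wings of the error distribution contribute to the noise estimate in each bin.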

We derive the best-fit coefficients of Equation 13 from the above procedure to be
$(a, b, c) = (1.58, 5.03, 0.39)$. The RMS difference between the fit and the measured
values (weighted by $N_{\rm bin}$) is $\sim 0.1$. The fitted surface is shown in
Figure 5. In general, the behavior of the measurement noise is intuitive: small and
faint galaxies have noisier measurements, and large, bright galaxies are well
measured.

We can now estimate the shear measurement noise ($\sigma_m$) for each galaxy in each
exposure by plugging the $\nu$ and $R$ values (for each galaxy in each exposure) into
Equation 13. In this calculation, four parameters from the CatSim catalog (the
apparent magnitude, the galaxy bulge and disk half-light radii, and the
bulge-to-total flux ratio) and two parameters from the OpSim catalog (the seeing and
sky background in each exposure) are used. Note that we have derived the model of
shear measurement noise ($\sigma_m$) via galaxies in a certain $\nu$ and $R$ range
that can be detected and measured in single-exposure simulations (Figure 3). For
galaxy measurements with $\nu$ and $R$ values outside this range, our estimation of
$\sigma_m$ would be an extrapolation.
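As a concrete illustration, the fitted noise model can be evaluated with the short Python sketch below. It assumes that Equation 13 has the form $\sigma_m = (a/\nu)[1 + (b/R)^c]$ and uses the best-fit coefficients quoted above; the function names and the range check are illustrative additions.

```python
# Best-fit coefficients quoted in Section 4.2.
A_FIT, B_FIT, C_FIT = 1.58, 5.03, 0.39

# Range of (nu, R) covered by the single-exposure simulations.
NU_RANGE = (3.0, 650.0)
R_RANGE = (0.2, 43.0)


def sigma_m(nu, r, a=A_FIT, b=B_FIT, c=C_FIT):
    """Shear measurement noise for signal-to-noise nu and effective size r.

    Assumed functional form: sigma_m = (a / nu) * (1 + (b / r) ** c).
    """
    return (a / nu) * (1.0 + (b / r) ** c)


def is_extrapolation(nu, r):
    """True if (nu, r) lies outside the range used to fit the model."""
    in_nu = NU_RANGE[0] <= nu <= NU_RANGE[1]
    in_r = R_RANGE[0] <= r <= R_RANGE[1]
    return not (in_nu and in_r)
```

With these coefficients, a galaxy with $\nu = 20$ and $R = 1$ has $\sigma_m \approx 0.23$, slightly below the shape noise $\sigma_{\rm SN} \approx 0.26$.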



4.3 Combining multiple shear measurements of a 
single galaxy 

Shear measurements from the same galaxy imaged in multiple exposures are combined to
give a single estimate of shear per galaxy. For LSST, this is a crucial step because
the number of exposures of each galaxy is typically an order of magnitude larger than
for previous surveys. That is, the algorithm one uses to combine these shear
measurements is important.

Figure 4. Distribution of the shear measurement error in the $\nu \approx 15$,
$R \approx 1$ bin. The overlaid red dashed curve shows the best-fit Gaussian, which
does not capture the wings of the distribution. The shape of this distribution is
typical for most $(\nu, R)$ bins.

total light distribution (Equation A9; Kron 1980). This aperture size has been shown
to contain > 90% of the total signal for a wide range of galaxy shapes.

Figure 5. Shear measurement noise ($\sigma_m$) as a function of galaxy
signal-to-noise ratio ($\nu$) and effective size ($R$). As discussed in Section 4.4,
the three solid contours correspond to the galaxy selection cuts (see Equation 19 for
the definition of $k$) we apply. The dashed lines indicate a conventional 2D cut
($\nu > 20$, $R > 0.75$) that shares the same maximum measurement error as the
$k = 1$ cut.

Different approaches have been suggested to deal with such multi-epoch datasets.
Conventionally, one would create co-added images and carry out the full analysis on
the co-add. This is not an optimal approach: it throws away information obtained in
the sharpest images and, furthermore, correlates the noise in the pixels and creates
biases in the shear estimates. Bernstein & Jarvis (2002) and more recently Miller et
al. (2007) and Tyson et al. (2008) have suggested taking the approach of joint
fitting, where the individual exposures are kept separate throughout the analysis.
The detections would still be made on a coadded image to maximize signal-to-noise
ratio in the detection process. But then the shape measurement would involve a joint
fit using all the original pixels from the individual exposures, making it possible
to weight the measurement in different exposures according to the noise in that
particular exposure. That is, we extract more information from good images and less
information from bad images. The latter approach is optimal when the image quality
varies across exposures.

First, we consider the case for which the shear is estimated jointly using all the
original (individual) images. In this case, the optimal joint estimator will have a
net measurement error, $\sigma_{m,{\rm joint}}$, of

$$ \frac{1}{\sigma_{m,{\rm joint}}^2} = \sum_{j=1}^{N_{\rm exp}} \frac{1}{\sigma_{m,j}^2} , \tag{16} $$

where $\sigma_{m,j}$ is the measurement error estimate from Equation 13 for each
exposure $j$ out of the $N_{\rm exp}$ total exposures.
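The inverse-variance combination for the joint estimator can be written in a few lines of Python. This is a minimal sketch: `sigma_m_joint` is a hypothetical helper name, and its input is assumed to be the list of single-exposure noise values $\sigma_{m,j}$ for one galaxy.

```python
def sigma_m_joint(sigma_m_per_exposure):
    """Net measurement noise of the optimal joint estimator.

    Combines single-exposure noise values by summing inverse variances:
    1 / sigma_joint^2 = sum_j 1 / sigma_j^2.
    """
    inverse_variance = sum(1.0 / s ** 2 for s in sigma_m_per_exposure)
    return inverse_variance ** -0.5
```

For $N_{\rm exp}$ exposures of equal quality this reduces to $\sigma_m/\sqrt{N_{\rm exp}}$, and adding a poor exposure can only lower the combined noise, which is the sense in which good images contribute more information than bad ones.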

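For the second case of combining the images and then measuring shear from the
coadded image, we can calculate the effective $\nu$ and $R$ values using the total
signal and background and an estimate of the net PSF size from adding the exposures:

$$ S_{\rm tot} = \sum_{j=1}^{N_{\rm exp}} S_j , \qquad
   B_{\rm tot} = \sum_{j=1}^{N_{\rm exp}} B_j , \qquad
   r_{\rm PSF,eff}^2 = \frac{1}{N_{\rm exp}} \sum_{j=1}^{N_{\rm exp}} r_{{\rm PSF},j}^2 , $$
$$ \nu_{\rm eff} = \frac{S_{\rm tot}}{\sqrt{S_{\rm tot} + B_{\rm tot}}} , \qquad
   R_{\rm eff} = \frac{r_{\rm gal}}{r_{\rm PSF,eff}} , \tag{17} $$

and

$$ \sigma_{m,{\rm coadd}} = \frac{a'}{\nu_{\rm eff}}
   \left[ 1 + \left( \frac{b'}{R_{\rm eff}} \right)^{c'} \right] . \tag{18} $$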



In principle, $(a', b', c')$ and $(a, b, c)$ need not be identical. The difference
can be due to the specific PSF modeling technique one uses. This means that the fit
shown in Figure 5 (based on single-exposure measurements) may no longer be
appropriate in the case of coadded images. Nevertheless, for the purpose of this
analysis, we avoided most PSF-related issues and specifically explored a wide range
of reasonable $(\nu, R)$ values. As a result, we will make the approximation
$(a', b', c') \approx (a, b, c)$, effectively generalizing Equation 13 to all
$(\nu, R)$ values.

We first consider the case for which only r-band images are used and then consider
the case for which both r- and i-band images are used. For LSST, we expect a similar
number ($\sim 368$) of r- and i-band 15-s exposures in the full 10-year dataset. Note
that the technical difficulties of combining images for different bands still need to
be assessed.
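The two ways of combining exposures can be compared with the following Python sketch. It is an illustration under explicit assumptions: Equation 13 is taken to have the form $\sigma_m = (a/\nu)[1 + (b/R)^c]$ with the single-exposure coefficients reused for the coadd, each exposure is described by a hypothetical tuple $(S_j, B_j, r_{{\rm PSF},j})$, and the function names are ours.

```python
import math


def sigma_m(nu, r, a=1.58, b=5.03, c=0.39):
    """Assumed noise model: sigma_m = (a / nu) * (1 + (b / r) ** c)."""
    return (a / nu) * (1.0 + (b / r) ** c)


def coadd_noise(exposures, r_gal):
    """Measurement noise when shear is measured on the coadded image.

    exposures: list of (S_j, B_j, r_psf_j) tuples for one galaxy.
    """
    s_tot = sum(s for s, _, _ in exposures)
    b_tot = sum(b for _, b, _ in exposures)
    mean_r2 = sum(r ** 2 for _, _, r in exposures) / len(exposures)
    nu_eff = s_tot / math.sqrt(s_tot + b_tot)
    r_eff = r_gal / math.sqrt(mean_r2)
    return sigma_m(nu_eff, r_eff)


def joint_noise(exposures, r_gal):
    """Measurement noise of the inverse-variance joint estimator."""
    inverse_variance = 0.0
    for s, b, r_psf in exposures:
        nu = s / math.sqrt(s + b)
        inverse_variance += 1.0 / sigma_m(nu, r_gal / r_psf) ** 2
    return inverse_variance ** -0.5
```

When all exposures have identical signal, background and PSF size, the two estimates coincide; differences arise only when the image quality varies across exposures.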



4.4 Galaxy selection 

In all weak lensing analyses to date, one does not use all the galaxies that are
detected and identified as galaxies to conduct cosmic shear measurements. Instead,
only galaxies that pass certain criteria make it into the final summation in
Equation 9. Although noisy galaxies naturally have low weights in Equation 9, there
are several practical reasons that one rejects part of the galaxy population: First,
most shear measurement algorithms become numerically unstable when working with noisy
galaxies. Second, these noisy galaxies are subject to noise bias (Refregier et al.
2012; Melchior & Viola 2012) and can increase the systematic errors significantly.
Finally, assuming one were to calibrate these systematic errors with simulations
(Kacprzak et al. 2012), the calibration process becomes challenging and unstable as
the measurement noise increases.

Given the reasoning above, a natural route to select
galaxies is to base the selection on the relative level of the
shear measurement errors (σ_m) and the shape noise (σ_SN).
In other words, we use only galaxies that satisfy

\[
\sigma_m < k \, \sigma_{\rm SN} , \tag{19}
\]



where k is of order unity and depends on the performance
of the shear measurement algorithm. In the main analysis
of this paper, we choose three k values based on the studies
in Bridle et al. (2010) and Kitching et al. (2012a). These
studies have shown that most current shear measurement
algorithms perform well on objects with (ν, R) ≈ (40, 1.5),
operate with moderate accuracy on objects with (ν, R) ≈
(20, 1), and tend to fail on objects with (ν, R) ≈ (10, 0.5).
These three cases roughly correspond to k = (0.5, 1.0, 2.0),
respectively, if we assume that the shear measurement noise
follows Figure 5. In the rest of this paper, we will thus consider
these three galaxy selection cuts, where k = 1.0 corresponds to
the fiducial case, k = 2.0 corresponds to the optimistic case, and
k = 0.5 corresponds to the conservative case. The fiducial
case corresponds to using a shear measurement algorithm
with accuracy similar to current state-of-the-art methods.
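
As a minimal sketch of how the cut in Equation 19 feeds into the effective number density, the following Python function selects galaxies by measurement noise and sums their weights. It assumes that Equation 9 has the inverse-variance form n_eff = (1/Ω) Σ σ_SN² / (σ_SN² + σ_m,i²); the function name, the argument names and the toy inputs are hypothetical, and σ_SN = 0.26 is the shape noise adopted in this paper.

```python
def n_and_n_eff(sigma_m_values, area_arcmin2, k=1.0, sigma_sn=0.26):
    """Return (n, n_eff) per square arcminute after the cut sigma_m < k * sigma_sn.

    Assumes each surviving galaxy contributes the weight
    sigma_sn**2 / (sigma_sn**2 + sigma_m**2) to n_eff.
    """
    cut = k * sigma_sn
    kept = [s for s in sigma_m_values if s < cut]
    weights = [sigma_sn ** 2 / (sigma_sn ** 2 + s ** 2) for s in kept]
    n_raw = len(kept) / area_arcmin2
    n_eff = sum(weights) / area_arcmin2
    return n_raw, n_eff
```

A noiseless galaxy contributes a weight of 1, a galaxy with σ_m = σ_SN contributes 0.5, and noisier galaxies contribute progressively less, so n_eff can never exceed n.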

We note that a more common approach in existing weak
lensing analysis pipelines is to select galaxies based on a 2D
cut in the ν-R plane. However, we argue that galaxy selection
based on such a cut is not necessarily optimal. As illustrated
in Figure 5, the two galaxy samples selected by the
k = 1 contour and the dashed rectangle (equivalent to a 2D
cut of ν > 20, R > 0.75) have the same allowed maximum
measurement noise. But the dashed rectangle clearly encloses
less parameter space than the k = 1 contour, thus leading to a
smaller n_eff. This simple illustration demonstrates that if the
measurement noise can be properly estimated, the most efficient
way to select galaxies for weak lensing measurements
is to base the selection on the measurement noise directly
(Equation 19; see also the discussion in Section 5.4).



Finally, we consider the range of redshifts for galaxies
used in the analysis. It is common for lensing analyses to
consider only a limited redshift range, since objects at very
high redshift will have poorly estimated photometric redshifts,
while objects at very low redshift do not provide
much cosmological information. As a result, we consider only
galaxies in the redshift range 0.1 < z < 3.

Table 1. Summary of the effective number of galaxies per square
arcminute used for weak lensing analyses, n_eff, derived in
Section 5 for the 10-year data from LSST. The columns under n
list the raw galaxy number densities for different galaxy selection
cuts (Equation 19), while the columns under n_eff list the
corresponding effective galaxy number densities. In the first row,
we use r-band images only and combine the multiple exposures
via joint-fitting. The second row shows how the numbers change
as we adopt a co-add approach instead. The last row shows the
case for which both r- and i-band images are used. The boldface
values are used later in Section 6.

We now estimate n_eff by combining the analyses from Sections 3
and 4 into Equation 9. First, we estimate the shear
measurement error on each galaxy in the galaxy catalog after
combining N_exp exposures. Then we reject galaxies with
measurement errors and redshift values that fail the selection
criteria set in Section 4.4. The remaining galaxies are
used to calculate n_eff using Equation 9 and σ_SN = 0.26. We
consider two combining approaches (co-add and joint-fit),
and also the possibility of combining r- and i-band data.
Table 1 and Figure 6 summarize the results of our analysis.
We discuss below several issues related to estimating n_eff.



5.1 Time dependence of n_eff

In Figure 6, we show n_eff as a function of the number of
exposures combined, or, equivalently, survey time. Figure 6(a)
shows the n_eff behavior for r-band only and for different k
values as described in Section 4.4. Figure 6(b) shows, for
the fiducial case (k = 1.0), how the result changes as
one combines multiple exposures differently and
when i-band data are included. The general trends for all
plots are similar: n_eff increases monotonically and does so
faster in the beginning of the survey. In all cases studied
here, the curves do not plateau during the survey.

As the exposure time increases, the same object will be
measured with a larger signal-to-noise ratio on the combined
image and is more likely to survive the σ_m cut. Given that
the number of galaxies increases dramatically as one goes
to the fainter end of the galaxy population (see Figure 1(c)),
we can expect a sharp increase in n_eff over time. However,
as the number density of galaxies surviving the σ_m
cut increases, blending may become an issue. We estimate
in Section 6 the possible effect of blending on n_eff. We also
note that the rate of increase of n_eff is slower than the naive
expectation of √N_exp. This is because we place a cut on σ_m
and not simply on signal-to-noise ratio. Thus n_eff will not
necessarily scale with √N_exp.

In Figure 6(b), we first look at the effect of combining
multiple exposures using a more traditional co-add
method versus the joint-fitting method described in
Section 4.3 (comparing the solid curves and the dashed curves).
The two curves start off at similar levels. Approximately half
a year into the survey, n_eff becomes slightly larger for the
joint-fit approach. This is because the joint-fit
method is optimal at extracting information from multiple
exposures with very different observing conditions, and that
effect becomes more pronounced as one collects more data.
When the i-band data are included, n_eff increases by
20-30% throughout the survey period (comparing
the solid curve and the dotted curve).

Figure 6. n_eff as a function of the number of exposures, or
operation time (as listed on the top axis). In (a), we show the case
for which r-band data are combined with a joint-fitting approach
and with different galaxy selection cuts (Equation 19). The red,
green and blue curves are for the optimistic (k = 2.0), fiducial
(k = 1.0), and conservative (k = 0.5) scenarios, respectively. In
(b), we show, for the fiducial case, how the curves change when
one combines multiple exposures via a co-add method (dashed
curves) and when multi-band data are included (dotted curves).
The three vertical dashed lines show the approximate N_exp
values corresponding to the equivalent depth of the three ongoing
surveys: KiDS, DES and HSC.

To compare the performance of LSST with ongoing surveys,
we also label on the plot the "equivalent N_exp values"
for KiDS, HSC and DES. These are rough estimates based
on the r-band limiting magnitudes for the three surveys. We
expect the source flux corresponding to the limiting magnitude
to scale inversely with the square root of the survey
time, and thus of N_exp. In other words,

\[
10^{-0.4 \left( m^{X} - m^{\rm LSST} \right)} = \sqrt{ \frac{N_{\rm exp}^{\rm LSST}}{N_{\rm exp}^{X}} } , \tag{20}
\]

where m is the r-band limiting magnitude and the superscript
X denotes the survey of interest. For LSST, m^LSST =
27.5 and N_exp^LSST = 368. Given the limiting magnitudes
for the three surveys, 25.2 (KiDS), 27 (HSC) and 25.6
(DES), Equation 20 yields N_exp^KiDS = 52, N_exp^HSC = 335 and
N_exp^DES = 97. Note that this estimate does not account for
the different image quality and measurement errors in the
different surveys.



5.2 Effect of weighting and galaxy selection cut 

As expected from Equation 9, n_eff is always smaller than
the raw number density of galaxies surviving the σ_m cut, n.
Comparing the first row in the n and n_eff columns of Table 1,
we find that n_eff is 61%, 79% and 90% of n for k = 2.0, 1.0 and
0.5, respectively. This increase in the n_eff-to-n ratio with
decreasing k is sensible: a smaller k selects a lower-noise galaxy
sample, so each galaxy contributes more to n_eff. Low
measurement noise is key to a larger n_eff value, for it affects
n_eff in two ways: (1) it enables small and faint galaxies to
pass the σ_m cut and (2) it weights individual galaxies more
in the summation in Equation 9.
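
These ratios can be checked against a simple floor. Assuming Equation 9 assigns each galaxy the inverse-variance weight σ_SN² / (σ_SN² + σ_m²) (an assumption about its form for the purpose of this check), every galaxy that passes the cut σ_m < k σ_SN carries a weight above 1/(1 + k²):

```latex
\[
w_i = \frac{\sigma_{\rm SN}^2}{\sigma_{\rm SN}^2 + \sigma_{m,i}^2}
    > \frac{\sigma_{\rm SN}^2}{\sigma_{\rm SN}^2 + k^2 \sigma_{\rm SN}^2}
    = \frac{1}{1 + k^2}
\quad \Longrightarrow \quad
\frac{n_{\rm eff}}{n} > \frac{1}{1 + k^2} =
\begin{cases}
0.2 & (k = 2.0) \\
0.5 & (k = 1.0) \\
0.8 & (k = 0.5)
\end{cases}
\]
```

The ratios of 61%, 79% and 90% quoted above all lie above these floors, as they must, and approach the floor more closely as k decreases.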

We also examine the effects of using a conventional
two-dimensional galaxy selection cut in the ν-R plane. We
calculate n_eff in the case for which the two-dimensional cut
(ν > 20, R > 0.75) is used instead of the measurement noise
cut σ_m < k σ_SN (k = 1). As shown in Figure 5, the
two-dimensional cut corresponds to a smaller parameter space
on the ν-R plane. This reduction in parameter space results
in a significant (~32%) decrease in n_eff.

Finally, the redshift selection cut we apply reduces n_eff
only slightly, at the <1% level.



5.3 Co-add vs. joint-fit 

In practice, there are several incentives to combine multiple
exposures via a joint-fit method rather than a co-add
method. These include avoiding the process of homogenizing
the multiple exposures and reducing correlated noise in
the images (Miller et al. 2012). The statistical power is not
commonly considered when making this choice. However,
using a joint-fit method naturally yields higher statistical
power compared to using a co-add method. The improvement
comes from the fact that for a co-add method, only
galaxies well measured in the "average" exposure are used,
while this is not necessarily true for the joint-fit method.
It is possible to use galaxies that are not well measured in
the "average" exposure in a joint-fit method, as long as the
galaxies are sufficiently well measured in some of the exposures.
We have estimated in Table 1 a small decrease in
n_eff (~7%) going from a joint-fit to a co-add approach.
We note that in this calculation, our n_eff estimate for the
co-add approach is less accurate than that for the joint-fit
approach. This is because, as noted in Section 4.3, we have
extrapolated Figure 5 from single-exposure results to the
co-add regime. In the co-add images, not only does the photon
noise in the galaxies decrease, but the stars also become better
measured and the PSF becomes smoother and easier
to model. But since we have avoided PSF estimation in this
analysis, Figure 5 should be a good approximation for both
the single exposure and the co-added image.



5.4 Combining data from multiple bands 

In most existing weak lensing analyses, only single-band
data are used, as the surveys are usually designed to have the
best image quality in a single band. For LSST, both
i- and r-band data should be sufficiently good for lensing
analyses. We thus consider the case for which both i- and
r-band images are used. Comparing the first and the last row
in Table 1, we see that the addition of i-band data results
in a ~20% growth in n_eff and not the naive √2 ≈ 1.4 factor
increase (Jarvis & Jain 2008). This is because i-band images
generally have a lower signal-to-noise ratio (higher background,
as seen in Figure 2) and thus higher measurement
noise. We also estimate n_eff for the case of combining all six
filter bands using the same method. We find that n_eff ~ 71,
55 and 36 arcmin^-2 for the optimistic, fiducial and conservative
cases, respectively. That is, n_eff increases by a factor
of ~1.8 going from one filter (r) to six filters and ~1.5 going
from two filters (r+i) to six filters. Again, the gain is
smaller than the naive √6 ≈ 2.4 and √3 ≈ 1.7. In addition to the
fact that all other bands are shallower than r-band, the average
seeing in other bands is generally similar to or worse than the
r- and i-band seeing.
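
The naive factors quoted above follow from assuming that every added band contributes an equally deep, independent shape measurement, so that the gain scales as the square root of the number of bands. A short Python check of that arithmetic (the function name is illustrative only):

```python
import math


def naive_gain(n_bands_after, n_bands_before=1):
    """Naive sqrt(N) gain when going from n_bands_before to n_bands_after
    equally informative bands. This is the idealized upper expectation,
    not the gain estimated from the simulations."""
    return math.sqrt(n_bands_after / n_bands_before)


# Fiducial values quoted in the text: 55 arcmin^-2 (six bands) and
# 37 arcmin^-2 (r+i).
estimated_gain_two_to_six = 55.0 / 37.0
```

Here `naive_gain(2)`, `naive_gain(6)` and `naive_gain(6, 2)` round to 1.4, 2.4 and 1.7, while the estimated two-to-six-band gain is about 1.5, below the naive value.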

In the estimation of n_eff above for multi-band datasets,
we have made several assumptions. First, we assumed that
individual galaxy shapes are the same in the different filter
bands. This is a good assumption according to Jarvis & Jain
(2008), who showed that the shape measurements in deep
multi-band space data are highly correlated between different
filter bands. Second, we have assumed that the measurement
noise in each band depends on the galaxy's signal-to-noise
ratio and effective size in the same way as in the r-band
images (i.e., the coefficients of Equation 13 are the same).
This assumption could fail if, for example, the PSF modeling
algorithm does not perform equally well in all bands.
However, since we have minimized the effect of the PSF
modeling procedure in our analysis, this should be a good
assumption. Finally, we made the assumption that the
joint-fitting algorithm is capable of operating on the multi-band
dataset.

Given the assumptions discussed above, a more detailed
study is required to understand the actual quantitative gain
from combining data from multiple filter bands. Nevertheless,
we point out here that it is in principle possible to achieve
large n_eff values by combining data from all the available
filters, even if some filters are not optimized for lensing. This
conclusion is consistent with that found in Jarvis & Jain
(2008).



5.5 Redshift distribution of n_eff

Finally, we look at the redshift distribution of n_eff. In
Figure 7, we plot n_eff in 16 redshift bins from redshift 0 to 4 for
the optimistic, fiducial and conservative scenarios. Also plotted
is the raw galaxy number density n before any galaxy
selection cut is applied, normalized to a similar level as the
other curves for qualitative comparison. We see that the
true galaxy redshift distribution peaks at a redshift approximately
0.35-0.5 greater than the n_eff redshift distribution.
This is because the high-redshift galaxies are usually poorly
measured and contribute less to n_eff, causing the redshift
distribution to shift toward lower redshift. We fit the
n_eff redshift distribution with the following functional form,
which is often used to characterize the galaxy redshift
distribution,

\[
P(z) = z^{\alpha} \exp\left[ -\left( \frac{z}{z_0} \right)^{\beta} \right] , \tag{21}
\]

and list the fitted coefficients in the first three rows of
Table 2. The median redshift z_m extracted from the curves
in Figure 7 is also listed in Table 2. The shape of the
distribution is somewhat different from that measured in
low-redshift galaxies, where α ≈ 2 and β ≈ 1.5 (Smail et al.
1995).

Figure 7. n_eff calculated in different redshift bins for the
optimistic (k = 2.0, red squares), fiducial (k = 1.0, green
diamonds) and conservative (k = 0.5, blue circles) scenarios,
where each redshift bin spans an interval of Δz = 0.25. The
extreme bins (z > 3) are usually rejected in cosmic shear analyses
due to inaccuracy in the photometric redshift estimation. The
grey band shows the raw galaxy number density n before any
galaxy selection cut is applied, which peaks at a higher redshift
compared to the n_eff distributions. The black dashed curve is the
best-fit functional form (Equation 21) for the fiducial case.
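
To illustrate how a median redshift follows from Equation 21, the sketch below evaluates the (unnormalized) functional form and locates the median by direct numerical integration. The function names, the integration settings and any z_0 value passed in are hypothetical choices for the example; the fitted coefficients of Table 2 are not reproduced here.

```python
import math


def redshift_pdf(z, alpha, beta, z0):
    """Unnormalized P(z) = z**alpha * exp(-(z / z0)**beta) (Equation 21)."""
    return z ** alpha * math.exp(-((z / z0) ** beta))


def median_redshift(alpha, beta, z0, z_max=4.0, n_steps=4000):
    """Median of P(z) on [0, z_max], using a midpoint-rule cumulative sum."""
    dz = z_max / n_steps
    zs = [(i + 0.5) * dz for i in range(n_steps)]
    ps = [redshift_pdf(z, alpha, beta, z0) for z in zs]
    half = 0.5 * sum(ps)
    running = 0.0
    for z, p in zip(zs, ps):
        running += p
        if running >= half:
            return z
    return z_max
```

For the low-redshift shape α = 2, β = 1.5, this recovers the familiar relation z_m ≈ 1.41 z_0, which provides a quick sanity check of the integration before applying it to other coefficient sets.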



Figure 7 and Table 2 show the first attempt to date to
quantitatively calculate the redshift distribution of n_eff. This
distribution can be used for more realistic Fisher-matrix
forecasting and simulations of cosmic shear surveys. Previous
calculations (Hu 1999; Takada & Jain 2004; Amara
& Refregier 2007) adopt the raw galaxy redshift distribution
for these applications, which could lead to an overestimation
of the constraining power of the cosmic shear
surveys.



Table 2. Best-fit coefficients for Equation 21, which describes the
redshift distribution of n_eff (the three curves in Figure 7). k refers
to the different galaxy selection cuts (Equation 19). Also listed is z_m,
the median redshift for the n_eff distributions. For comparison, the
last column lists the best-fit coefficients and median redshift for
the raw galaxy sample n (corresponding to the thick grey band
in Figure 7).

Figure 8. Fraction of galaxies that have neighbors with a
"center-to-center" distance less than d, i.e., that are blended, as a
function of the galaxy number density n. For each of the three d
values, the data points are overlaid with the best-fit function
(Equation 22, Table 3), shown as a black dashed curve. The three
cases plotted here correspond to conservative (d = 3"), fiducial
(d = 2"), and optimistic (d = 1") de-blending algorithms expected
for LSST. The orange star indicates the galaxy number density and
fraction of blended galaxies in the CFHTLenS dataset. The
de-blending algorithm used in CFHTLenS corresponds to d ≈ 2.7" in
our simple model.



6 EFFECTS OF OTHER PRACTICALITIES 

In addition to the considerations discussed above, several
practical factors affect n_eff at levels that cannot be neglected.
We roughly estimate the level of these effects. The
results of this section are summarized in Table 4.

(1) Blending 

After the selection cut, depending on the galaxy number
density, a fraction of the galaxies will be rejected because
they are blended with other close-by objects. At the low
galaxy number density in existing data, this effect is not
severe. As a result, the effect of blending has not been studied
in detail for previous weak lensing analyses. However,
given the high galaxy number density expected for LSST
(Table 1), the effect can be significant. We estimate below
roughly the effect of blending for LSST. A more detailed
study will be presented in Kirkby et al. (in preparation).

First, we assume a galaxy is "blended" when there are
other objects with center positions within a radius d of the
galaxy's center, where d depends on the capability of
the de-blending algorithm used. In other words, galaxies
with neighbors closer than d will be rejected by a given
de-blending algorithm because attempting to de-blend these
objects will introduce significant systematic errors in their
shape measurements. Next, we use the same galaxy catalog
introduced in Section 3 and plot the fraction of blended
galaxies as a function of galaxy number density in Figure 8
for three d values. For de-blending treatments in existing
datasets, conservative approaches at the level of d ~ 3" are
used (see Section 7). We expect that when the LSST survey
begins, de-blending algorithms will be improved, and objects
separated by approximately twice the width of the PSF, or
≈2", could be properly de-blended. d ≈ 1" represents an
optimistic case for which the de-blending algorithm is capable of
dealing with objects separated by approximately the width
of the PSF.

Table 3. Best-fit coefficients for Equation 22, which describes the
fraction of galaxies blended as a function of galaxy number density.
The fit is evaluated assuming different de-blending algorithms.
The three d values correspond to conservative (d = 3"),
fiducial (d = 2"), and optimistic (d = 1") de-blending algorithms
expected for LSST.

d      η       μ
1"     0.50    1.9 × 10^-3
2"     0.68    5.5 × 10^-3
3"     0.62    1.4 × 10^-2

The blending fraction F_blend in Figure 8 can be well described
by the functional form

\[
F_{\rm blend}(n) = \eta \ln(1 + \mu n) , \tag{22}
\]

where η and μ are coefficients to fit for, and n is the
galaxy number density. We list the best-fit coefficients in
Table 3 and overlay the best-fit function on the points in
Figure 8. In this simplistic blending model, n_eff is degraded
to (1 - F_blend) n_eff, given a certain d value. Note that the
degradation of n_eff from blending, F_blend, depends on the
raw galaxy number density n and not on n_eff. We list the
resulting n_eff for d = 2" in the second row of Table 4.
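
The blending model of Equation 22 is simple enough to evaluate directly. The sketch below uses the (η, μ) pairs listed in Table 3; the function and variable names are illustrative, and n is assumed to be the raw galaxy number density in arcmin^-2.

```python
import math

# (eta, mu) from Table 3, keyed by the de-blending radius d in arcseconds.
TABLE3_COEFFS = {1: (0.50, 1.9e-3), 2: (0.68, 5.5e-3), 3: (0.62, 1.4e-2)}


def blend_fraction(n_raw, d_arcsec=2):
    """F_blend(n) = eta * ln(1 + mu * n) for one of the tabulated d values."""
    eta, mu = TABLE3_COEFFS[d_arcsec]
    return eta * math.log(1.0 + mu * n_raw)


def degrade_for_blending(n_eff, n_raw, d_arcsec=2):
    """Apply the simple model n_eff -> (1 - F_blend(n)) * n_eff."""
    return (1.0 - blend_fraction(n_raw, d_arcsec)) * n_eff
```

At n = 35 arcmin^-2 the model gives a blended fraction of roughly 12% for d = 2" and 25% for d = 3", which brackets the 20% reported for CFHTLenS and is consistent with the d ≈ 2.7" assigned to that survey.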

(2) Masking 

Parts of the images will be masked due to bright stars (which
generate diffraction spikes, saturated columns, and large diffuse
halos) and edge effects. Masking can be folded into
the n_eff calculation or simply accounted for by claiming a
smaller survey area. In this paper, we choose to account
for it in n_eff so that the total survey area (18,000 deg^2)
is consistent with what one would assume in Fisher-matrix
calculations for LSST.

The fraction of area masked depends heavily on the field
observed. For pointings near the Galactic plane, many more
bright stars will need to be masked. We assume a typical
weak lensing field at moderate Galactic latitude b ~ 60 degrees.
At this latitude, we estimate that approximately 15% of
the image area is masked out. The resulting n_eff values are
listed in the last row of Table 4. This fraction of masked
area is based on typical values obtained in existing datasets
(Miyazaki et al. 2007; VanderPlas et al. 2012; Heymans et al.
2012b). Improvement in the masking technique is likely to
reduce the masked area.



Table 4. Estimation of n_eff accounting for degradation
caused by practical effects in data such as blending and masking,
for the optimistic (k = 2.0), fiducial (k = 1.0) and conservative
(k = 0.5) galaxy selection cuts. These effective number densities
are calculated for the last row in Table 1 (r- and i-band data,
combined through joint-fitting).

7 CASE STUDY - n_eff FOR CFHTLENS

To demonstrate the generality of our approach, we perform
the same n_eff calculation for the most recent CFHTLenS
dataset. We base the major parameters on the CFHTLenS
summary paper (Heymans et al. 2012b, hereafter H12), the paper
describing the shear measurement pipeline (Miller et al. 2012),
and the public release document (Goranova et al. 2012).

First, we point out that in H12, a different definition
of the "effective number density of weak lensing galaxies" is
used. To avoid confusion, we will refer to this definition as
n*_eff, where

\[
n^{*}_{\rm eff} = \frac{1}{\Omega^{*}} \frac{\left( \sum_i w^{*}_i \right)^2}{\sum_i \left( w^{*}_i \right)^2} . \tag{23}
\]

H12 defined Ω* to be the total area of the survey, excluding
masked regions, and the weighting factor w* is a measure of
the shear measurement error, defined as

\[
w^{*} = \left[ \frac{\sigma_e^2 \, e_{\rm max}^2}{e_{\rm max}^2 - 2 \sigma_e^2} + \sigma_{\rm pop}^2 \right]^{-1} , \tag{24}
\]

where σ_e^2 is the 1D variance in ellipticity of the likelihood
surface estimated in lensfit (footnote 14), σ_pop^2 is the variance
of the intrinsic ellipticity distribution of the galaxy population
(the analogue of σ_SN^2 in our work), and e_max is the maximum
allowed ellipticity. In the limit e_max → ∞, the first term in the
bracket, (σ_e^2 e_max^2) / (e_max^2 - 2σ_e^2), reduces to σ_e^2, which can
be associated with σ_m^2 in our work. In H12, n*_eff for the
main lensing sample is calculated to be 11 arcmin^-2, but n_eff
(as defined in Equation 9) is not calculated. Conceptually,
these two quantities measure slightly different things: n*_eff is
a measure of the fraction of all the galaxies used that have
measurement noise large compared to the average measurement
noise. n*_eff is equal to the raw number density of galaxies
selected when all the weights are the same and decreases
as the distribution of the weights becomes broader. On the
other hand, n_eff is defined specifically to measure the absolute
statistical power of a weak lensing dataset and is always
smaller than the raw number density of galaxies, even when
all the galaxies have the same nonzero measurement noise.
In the case of a very conservative cut, where most galaxies
selected have low measurement noise and similar weighting
factors, n*_eff can be close to n_eff since they both approach
the raw number density of galaxies.
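
The difference between the two definitions is easy to see numerically. The sketch below implements Equation 23 and, for comparison, the inverse-variance form assumed here for Equation 9; the function names and toy inputs are hypothetical, and the weights passed to the first function can be any positive per-galaxy weights.

```python
def n_eff_h12(weights, area):
    """n*_eff = (sum w)^2 / sum(w^2) / area (Equation 23)."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2 / area


def n_eff_this_work(sigma_m_values, area, sigma_sn=0.26):
    """Assumed form of Equation 9: sum of sigma_sn^2 / (sigma_sn^2 + sigma_m^2) / area."""
    total = sum(sigma_sn ** 2 / (sigma_sn ** 2 + s ** 2) for s in sigma_m_values)
    return total / area
```

With ten galaxies of identical weight on a unit area, `n_eff_h12` returns 10, the raw number density, whatever the common weight is. With the same ten galaxies all having σ_m = σ_SN, `n_eff_this_work` returns 5, reflecting the absolute loss of statistical power from measurement noise.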

The CFHTLenS dataset covers an area of 154 square
degrees, at approximately uniform depth. Each patch of sky
is imaged 6-7 times, with the exposure time in each image
being 600-700 s. For simplicity, we assume that all fields
are imaged 7 times, for 615 s each, making the total exposure
time for each galaxy close to the actual total exposure
time of ~4300 seconds. Lensing analyses are performed only
in the i-band, where the seeing conditions are particularly
good: the mean seeing is 0.64" and all images have seeing
better than 0.8". We assume the 7 exposures have the following
equally spaced seeing values: [0.48", 0.53", 0.59", 0.64",
0.69", 0.75", 0.80"]. We also assume that the sky background
is constant at 20.0 mag/arcsec^2. A magnitude cut of
i_AB < 24.7 is placed to ensure the shape measurements are
accurate; we replace the σ_m cut with this magnitude cut.
There is no explicit cut in galaxy size in the shear measurement
algorithm used in CFHTLenS (footnote 15), but since the galaxy
models have a prior on the scale length set to a minimum of
0.3 pixels, we use that as a measure of the implicit size cut.
Using the mean seeing, 0.64" / (0.187"/pixel) ≈ 3.5 pixels,
we have R_cut = (0.3/3.5)^2 ≈ 0.007. An additional redshift
cut (0.2 < z < 1.3) is applied to the galaxy sample used for
the lensing analysis to ensure accurate photometric redshift
estimates. 20% of the galaxies are rejected due to serious
blending. 19% of the survey is masked due to bright stars.
A total of 39% of the area is rejected when including the
above masking and systematics diagnostics.

14 CFHTLenS uses lensfit (Miller et al. 2007; Kitching et al. 2008;
Miller et al. 2012) as the main shear measurement algorithm.

The number density of galaxies with shape and redshift
measurements is ~17 arcmin^-2, which went into the calculation
of n*_eff and the lensing analysis. Note that this is much
lower than the raw number density of galaxies at i_AB < 24.7
(~35 arcmin^-2). Four effects contribute to the reduced
galaxy number density (Miller, private communication): (1)
the redshift cut at 0.2 < z < 1.3, (2) incompleteness in
the photometric redshift data, (3) rejection of galaxies
that are larger than the size of the postage stamps (~9"
on a side), and (4) rejection of blended galaxies. Given the
expected raw number density of galaxies (~35 arcmin^-2)
and the final fraction of blended galaxies (20%), we calculate
that the effective blending criterion used in CFHTLenS
corresponds to d ~ 2.7" in Figure 8.

Putting in the above numbers, we calculate that without
any blending rejection and masking, we have n_eff ~ 13
arcmin^-2, a factor of ~3 lower than the LSST n_eff for
r-band and the fiducial galaxy selection cut. Taking into account
a 20% blending rejection, we have n_eff ~ 10 arcmin^-2.
As expected for the rather conservative galaxy selection cut,
this n_eff value is close to the n*_eff ≈ 11 arcmin^-2 calculated
in H12. Finally, taking into account the 39% rejected area,
which is avoided in H12 by considering the un-rejected area
only, we have n_eff ≈ 6 arcmin^-2. This number, together with
a shape noise of σ_SN ≈ 0.26 and a 154 square degree survey
area, makes up a self-consistent set of parameters that quantifies
the statistical errors in the CFHTLenS cosmic shear
measurement. Note that one should be careful not to compare
the two numbers, n_eff and n*_eff, directly, as they measure
slightly different properties of the galaxy population.
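
The successive degradations quoted above are multiplicative, and the arithmetic can be reproduced with a one-line helper. This is a sketch of the bookkeeping only, with a hypothetical function name; the input values are the ones stated in the text.

```python
def apply_losses(n_eff, blend_fraction=0.0, rejected_area_fraction=0.0):
    """Scale n_eff by the surviving galaxy fraction and the surviving area fraction."""
    return n_eff * (1.0 - blend_fraction) * (1.0 - rejected_area_fraction)


# CFHTLenS: start from ~13 arcmin^-2, reject 20% for blending, then 39% of the area.
cfhtlens_after_blending = apply_losses(13.0, blend_fraction=0.20)
cfhtlens_final = apply_losses(13.0, blend_fraction=0.20, rejected_area_fraction=0.39)

# LSST fiducial: ~31 arcmin^-2 after blending, then 15% of the area masked.
lsst_fiducial_final = apply_losses(31.0, rejected_area_fraction=0.15)
```

These evaluate to about 10.4, 6.3 and 26.4 arcmin^-2, matching the rounded values of 10, 6 and 26 arcmin^-2 given in the text.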



Table 5. Main survey and analysis parameters from CFHTLenS
related to the n_eff calculation. The asterisks indicate that the
values are estimated or approximated from the real parameters
in the survey. When applicable, the parameters are specified for
the CFHT i-band.

Total survey area                               154 deg^2
Number of exposures*                            7
Exposure time per exposure*                     615 s
Seeing distribution*                            [0.48, 0.53, 0.59, 0.64, 0.69, 0.75, 0.80] "
Sky background*                                 20 mag/arcsec^2
Wavelength range                                685 - 840 nm
Pixel scale                                     0.187 "/pixel
Redshift range                                  0.2 - 1.3
Magnitude range                                 i_AB < 24.7
Size cut*                                       0.007
Fraction of galaxies rejected due to blending   20%
Fraction of area rejected                       39%

15 lensfit attempts to fit all detected objects with star and galaxy
models and then classifies them as stars or galaxies according to
which model gives a higher posterior probability.



8 CONCLUSION 

The effective number density of weak lensing galaxies,
n_eff, is a measure of the statistical power of a weak lensing
survey. In this paper, we have conducted a detailed and
systematic analysis to calculate n_eff for LSST. Our analysis
considers all major components in a weak lensing pipeline,
including the galaxy population of interest, the distribution
of observing conditions, the measurement errors, the galaxy
selection procedure, the approach for combining multiple
exposures, and blending/masking effects. By using realistic
simulations, we estimate that with current weak lensing
algorithms (the fiducial scenario), n_eff ~ 37 arcmin^-2 before
considering masking and blending, n_eff ~ 31 arcmin^-2
when rejecting the blended galaxies and n_eff ~ 26 arcmin^-2
when factoring in a 15% masked area. This is estimated for
LSST after combining all the r- and i-band data in the full
10-year survey over an 18,000 deg^2 survey area. With
improvement in the weak lensing analysis algorithms, we can
expect (optimistically) n_eff ~ 48 arcmin^-2 before accounting
for masking and blending, n_eff ~ 36 arcmin^-2 when
blended galaxies are rejected and n_eff ~ 31 arcmin^-2 with
15% of the area masked (Table 4). We have shown quantitatively
how improvement in the weak lensing algorithm
as well as in de-blending/masking techniques can lead to large
improvements in n_eff.

Different schemes for combining multiple exposures are
discussed in this paper. We find that using a co-add method
reduces n_eff by 7% compared to a joint-fit method, and that
optimally combining data from all six filters could increase n_eff
by a factor of 1.5 compared to using only the r- and i-band
(lensing-optimized) data. We also quantify for the first time
the redshift distribution of n_eff, which has a median redshift
0.35-0.5 lower than the raw galaxy distribution (Figure 7).
Finally, we demonstrate how the same methodology can be
applied to current weak lensing surveys and show that the
results are reasonable.

Now we review and compare our results to the n_eff values
for LSST that have been estimated in previous studies.
Before comparing our results with the different studies, it
is important to realize two issues regarding the definition
of n_eff. First, many other papers define and quote differently
the "effective number density of galaxies used for weak
lensing measurements". H12 is an example, described earlier
(Equation 23), that has an entirely different definition
and underlying meaning of n_eff. Huterer et al. (2006) and
Paulin-Henriksson et al. (2008), on the other hand, use the
number n_g ≈ 30 arcmin^-2 in their analyses, where n_g is the
raw number of galaxies instead of the weighted n_eff. Second,
most previous studies do not consider the masked area in
the n_eff calculation. To perform a fair comparison, we will
thus compare our n_eff estimate before masking (second
row in Table 4) with other studies. That is, we have n_eff ~
36 arcmin^-2 (optimistic), n_eff ~ 31 arcmin^-2 (fiducial), and
n_eff ~ 22 arcmin^-2 (conservative). Under this definition, one
needs to specify an additional estimate of the masked survey
area to fully characterize the statistical power in weak
lensing for LSST.

In LSST Science Collaboration et al. (2009), n_eff ~ 40
arcmin^-2 was estimated for LSST. This estimate was based
on scaling the measurement in Clowe et al. (2006) to the
LSST depth and includes both r- and i-band images, with
a rough estimate of blending. To perform a fair comparison,
we note that at the time when the analysis of Clowe
et al. (2006) was conducted, it was common to use a shear
measurement algorithm that is subject to either the
"conservative" or the "fiducial" scenario in this paper. This
suggests that our estimates in this work, n_eff ~ 31 arcmin^-2
and 36 arcmin^-2, are relatively lower. We attribute this to
two factors. First, the Clowe study is based on a relatively
small area of sky; thus the conclusions may not be general.
Second, their blending estimation may be overly optimistic.
Similarly, an independent group approached the problem by
comparing HST and Subaru observations of the same field
and combining multiple exposures with algorithms that take
into account the PSF variation over time (Jee & Tyson 2011;
Tyson, Dawson and Jee, private communication). They
conclude n_eff ~ 36 arcmin^-2 for the full LSST r- and i-band
dataset, consistent with our optimistic estimate. In A06, n_eff
was estimated to be ~30 arcmin^-2 in the pessimistic scenario
and ~40 arcmin^-2 in the optimistic scenario. This is
fairly consistent with what we have estimated for the "fiducial"
and "optimistic" scenarios. With the detailed study in
this paper we have a much better understanding of the origin
of n_eff and the multiple factors that can affect its level.
Assumptions about these factors should be specified when
stating any n_eff estimate.

Although the focus of this paper is the statistical errors for future
lensing surveys, it is important to realize that in many cases
systematic errors are inevitably coupled with the statistical errors in
lensing measurements. In Section 4.4, for example, we select galaxies
with measurement noise lower than a certain threshold for cosmic shear
measurement. The main purpose of this selection cut is to avoid
systematic errors in shear measurements, which become large for faint
and small galaxies. Similarly, in Section 6 we reject galaxies that have
close neighbors. This is to avoid shear measurement biases from errors
in the de-blending process. Finally, incorporating data from filters
that are not optimized for lensing does not increase $n_{\rm eff}$
significantly, but multi-color data is potentially useful for
identifying and quantifying systematic uncertainties. The balance
between statistical and systematic errors is the key issue to address
when designing lensing pipelines for the next generation of surveys.
After all, it is the combination of the statistical and systematic
errors that determines the ultimate uncertainties in the cosmological
parameters.
APPENDIX A: CALCULATING THE
SIGNAL-TO-NOISE RATIO ($\nu$) AND
EFFECTIVE SIZE ($R$) OF GALAXIES

For the analysis in this paper, we need to connect the input galaxy and
observational parameters with the quantities measured from real images.
This conversion is essential for calculating the signal-to-noise ratio
($\nu$) and effective size ($R$) of galaxies in our analysis.



A1 Conversion of galaxy model parameters to
observable quantities

For the bulge+disk galaxy model provided by CatSim, we are given the
half-light radius of the bulge and the disk separately, the total
magnitude, and the ratio of flux in the bulge to the total flux (see
Figure 1). The bulge and disk are modeled by Sersic profiles with Sersic
index $n = 4$ and $n = 1$ respectively. When measuring the galaxy
parameters in real data, the galaxy is measured as a whole (bulge+disk).
Moreover, sizes of galaxies are often measured through moments of the
light distribution rather than the half-light radius. Here we attempt to
connect the two conventions. Imagine a galaxy with the following
profile:

$$ I(r) = I_{\rm tot} \left[ f_b\, e^{-(r/r_b)^{1/4}} + (1 - f_b)\, e^{-r/r_d} \right] , \tag{A1} $$



where the subscripts 'b' and 'd' indicate the bulge and disk
respectively, and $r_b$, $r_d$ are the scale lengths of the bulge and
disk components. First, we can calculate numerically the relation
between the scale lengths and the half-light radii, $r_{h,b}$ and
$r_{h,d}$, for the individual components:

$$ r_{h,b} \approx 3459\, r_b , \qquad r_{h,d} \approx 1.68\, r_d . \tag{A2} $$



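Next, we solve analytically for the second-moment radius of these two
components:

$$ r_{{\rm sec},b} \approx 16108\, r_b \approx 4.66\, r_{h,b} , \tag{A3} $$

$$ r_{{\rm sec},d} \approx 2.45\, r_d \approx 1.46\, r_{h,d} , \tag{A4} $$

where the second-moment radius is defined by

$$ r_{\rm sec}^2 = \frac{\int_0^{2\pi}\!\int_0^{\infty} I(r)\, r^2\, r\, dr\, d\theta}{\int_0^{2\pi}\!\int_0^{\infty} I(r)\, r\, dr\, d\theta} . \tag{A5} $$

Since the moments are additive component-wise, we have:

$$ \begin{aligned} r_{\rm sec} &= \sqrt{f_b\, r_{{\rm sec},b}^2 + (1 - f_b)\, r_{{\rm sec},d}^2} \\ &= \sqrt{f_b\, (4.66\, r_{h,b})^2 + (1 - f_b)\, (1.46\, r_{h,d})^2} . \end{aligned} \tag{A6} $$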
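
The numerical coefficients in Equations A2-A4 can be checked with a short script. This is a minimal sketch, assuming the component profiles $e^{-(r/r_0)^{1/n}}$ of Equation A1 with integer $n$; the function names are ours and are not part of CatSim or of the analysis pipeline. The enclosed flux of such a profile is a regularized incomplete gamma function, and the ratio of the moment integrals in Equation A5 reduces to $\Gamma(4n)/\Gamma(2n)$.

```python
import math


def sersic_enclosed_fraction(x, n):
    """Flux fraction inside r for I ~ exp(-(r/r0)^(1/n)), where x = (r/r0)^(1/n).

    For integer n this is the regularized lower incomplete gamma P(2n, x),
    which has the closed form 1 - exp(-x) * sum_{k < 2n} x^k / k!.
    """
    term, total = 1.0, 1.0
    for k in range(1, 2 * n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def half_light_over_scale(n):
    """Solve P(2n, x) = 0.5 by bisection and return r_h / r0 = x^n (Eq. A2)."""
    lo, hi = 0.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sersic_enclosed_fraction(mid, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return (0.5 * (lo + hi)) ** n


def second_moment_over_scale(n):
    """r_sec / r0 = sqrt(Gamma(4n) / Gamma(2n)), from the integrals in Eq. A5."""
    return math.sqrt(math.gamma(4 * n) / math.gamma(2 * n))


def galaxy_second_moment_radius(f_b, r_hb, r_hd):
    """Second line of Eq. A6, with coefficients computed rather than hard-coded."""
    c_b = second_moment_over_scale(4) / half_light_over_scale(4)  # close to 4.66
    c_d = second_moment_over_scale(1) / half_light_over_scale(1)  # close to 1.46
    return math.sqrt(f_b * (c_b * r_hb) ** 2 + (1.0 - f_b) * (c_d * r_hd) ** 2)


for name, n in [("bulge", 4), ("disk", 1)]:
    print(name, half_light_over_scale(n), second_moment_over_scale(n))
```

The large bulge coefficient reflects the extended wings of the $n = 4$ profile, which dominate the second moment even when the half-light radius is small.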



A2 Connecting input and measured
signal-to-noise ratio ($\nu$) and effective size ($R$)

Now we can calculate Equation 14 and Equation 15 from the input catalogs
as well as the measured images.

From the input galaxy catalog, we first calculate the galaxy's effective
size $R$ by finding $r_{\rm gal}$ and $r_{\rm PSF}$ separately. We use
the second line of Equation A6 to calculate $r_{\rm gal}$, and then
calculate $r_{\rm PSF}$ from

$$ r_{\rm PSF} = \frac{\sqrt{2}\, r_{\rm seeing}}{2\sqrt{2 \ln 2}} , \tag{A7} $$



where the $r_{\rm seeing}$ factor is usually expressed in terms of the
1D full-width-half-maximum (FWHM) size, and the $\sqrt{2}$ in the
numerator converts the 1D quantity to 2D.

Table A1. Average throughput for LSST across each filter band
at airmass 1.2. This includes the throughput of the atmosphere,
the optics and the detectors.

filter            u      g      r      i      z      y
throughput (%)    23.5   46.9   51.1   49.0   47.2   18.1

Table B1. $n_{\rm eff}$ in the case of a more realistic PSF
interpolation method, compared to the ideal PSF model used in the main
analysis, for the optimistic ($k = 2.0$), fiducial ($k = 1.0$) and
conservative ($k = 0.5$) galaxy selection cuts (Equation 19). The
difference in the PSF interpolation scheme causes a small (3-4%)
degradation in $n_{\rm eff}$.

k                   2.0    1.0    0.5
True PSF            48     37     24
Interpolated PSF    46     36     23

Figure A1. Relation between the second-moment radius and the
aperture radius as measured from simulations with a measurement noise
cut $k = 1.0$ (Equation 19). The relation is approximately linear and
can be fitted with the black dashed line with slope ~1.64.

Next, to calculate the signal-to-noise ratio, we first derive the
second-moment radius of the convolved galaxy image using

$$ r = \sqrt{r_{\rm gal}^2 + r_{\rm PSF}^2} . \tag{A8} $$
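
Equations A7 and A8 can be sketched in a few lines. This is an illustrative helper, not the pipeline code: the function names and the 0.7 arcsec seeing value are hypothetical, the FWHM-to-sigma conversion in Equation A7 is exact only for a Gaussian PSF, and reading the effective size as $R = r_{\rm gal}/r_{\rm PSF}$ is our interpretation of Equation 15, which is not reproduced in this appendix.

```python
import math

# FWHM = 2 * sqrt(2 ln 2) * sigma for a 1D Gaussian.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def r_psf_from_seeing(fwhm):
    """Eq. A7: 2D second-moment radius of a Gaussian PSF with the given 1D FWHM."""
    return math.sqrt(2.0) * fwhm * FWHM_TO_SIGMA


def convolved_radius(r_gal, r_psf):
    """Eq. A8: second-moment radii add in quadrature under convolution."""
    return math.hypot(r_gal, r_psf)


def effective_size(r_gal, r_psf):
    """Assumed form of the effective size, R = r_gal / r_PSF."""
    return r_gal / r_psf


seeing_fwhm = 0.7  # arcsec, hypothetical value for illustration
r_psf = r_psf_from_seeing(seeing_fwhm)
r_gal = 0.5        # arcsec, hypothetical galaxy second-moment radius
print(r_psf, convolved_radius(r_gal, r_psf), effective_size(r_gal, r_psf))
```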



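We then need the aperture radius $r_{\rm ap}$ for the convolved galaxy.
In Kron (1980), $r_{\rm ap}$ is defined to be

$$ r_{\rm ap} = 2\, r_{\rm first} = 2\, \frac{\int_0^{\infty} I^*(r)\, r^2\, dr}{\int_0^{\infty} I^*(r)\, r\, dr} , \tag{A9} $$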



where $I^*(r)$ is the radial profile of the convolved galaxy. Instead of
calculating this from the catalog, which is complicated by the PSF
convolution, we find empirically the relation between the second-moment
radius and the aperture radius for typical galaxies in the simulations
used in Section 4.1. Figure A1 shows that for the galaxy sample with
measurement noise below $\sigma_{\rm SN}$, there is a simple linear
relation between $r$, the convolved galaxy second-moment radius, and
$r_{\rm ap}$, the aperture radius:

$$ r_{\rm ap} \approx 1.64\, r . \tag{A10} $$



Varying the measurement noise cut changes the relation, but for the
galaxy sample of interest here, $r_{\rm ap}$ is generally 1.5-2 times
the second-moment radius $r$. In the main analysis of this paper, we use
the fiducial cut ($k = 1.0$) and assume $r_{\rm ap} \approx 1.64\, r$.
In the worst-case scenario where $r_{\rm ap} \approx 2r$ (corresponding
to $k = 0.5$), we have a ~17% decrease in $n_{\rm eff}$. The source
count in this aperture is estimated as 90% of the source count derived
from the total source magnitude, while the background count is the area
of the aperture times the background flux, which is calculated from
OpSim. Both source and background counts must be multiplied by the
throughput of the system in the filter used. We use the average
throughput at airmass 1.2 for LSST listed in Table A1.
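
The catalog-side signal-to-noise estimate described above can be sketched as follows. This is a hedged illustration rather than the pipeline code: the function name and argument names are ours, a circular aperture of area $\pi r_{\rm ap}^2$ is assumed, the inputs are taken to be counts before throughput is applied, and the noise is taken to be $\sqrt{S + B}$ as stated for the measured catalogs.

```python
import math


def catalog_snr(total_source_counts, background_counts_per_area, r_conv,
                throughput, aperture_factor=1.64, enclosed_fraction=0.90):
    """Estimate nu = S / sqrt(S + B) inside the aperture r_ap = aperture_factor * r.

    total_source_counts        : counts implied by the total magnitude (before throughput)
    background_counts_per_area : sky counts per unit area (before throughput)
    r_conv                     : convolved second-moment radius r from Eq. A8
    throughput                 : fractional system throughput, e.g. 0.511 for 51.1%
    """
    r_ap = aperture_factor * r_conv
    aperture_area = math.pi * r_ap ** 2            # circular aperture (assumption)
    source = enclosed_fraction * total_source_counts * throughput
    background = aperture_area * background_counts_per_area * throughput
    return source / math.sqrt(source + background)


# Hypothetical inputs, for illustration only.
nu_fiducial = catalog_snr(2.0e4, 3.0e3, 0.6, 0.511)
nu_wide = catalog_snr(2.0e4, 3.0e3, 0.6, 0.511, aperture_factor=2.0)
print(nu_fiducial, nu_wide)
```

In this sketch a wider aperture at fixed enclosed fraction only adds background, so the estimated $\nu$ drops, which is the direction of the effect described for the $r_{\rm ap} \approx 2r$ case.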

From the measured image, we identify the IMCAT output parameter 'rg' as
the second-moment radius (Equation A5) of the measured objects. The
effective radius $R = r_{\rm gal}/r_{\rm PSF}$ is then calculated from

$$ r_{\rm PSF} = r_{g,{\rm star}} , \tag{A11} $$

$$ r_{\rm gal} = \sqrt{r_{g,{\rm gal}}^2 - r_{g,{\rm star}}^2} . \tag{A12} $$

The aperture flux ($S$) and noise ($\sqrt{S + B}$) are outputs of the
Source Extractor catalog ('FLUX_BEST' and 'FLUXERR_BEST'). We follow the
convention of Leauthaud et al. (2007) and divide the two parameters to
get the signal-to-noise ratio ($\nu$) of galaxies.



APPENDIX B: EFFECT OF PSF
INTERPOLATION

In the main analysis of this paper we used a rather idealized PSF model
to avoid the PSF estimation problems that can vary significantly from
pipeline to pipeline. Here, we demonstrate how $n_{\rm eff}$ changes if
we consider a more realistic case for PSF interpolation.

PSF interpolation refers to the procedure where we interpolate the PSF
model parameters from the stellar positions onto the galaxy positions.
Conventionally, one would use a smooth low-order polynomial to fit each
parameter over the field. This interpolation scheme, however, has been
shown to be problematic when there are high spatial frequency PSF
variations from the atmosphere in short exposures (Heymans et al. 2012a;
Chang et al. 2012). We examine below how this imperfect PSF model
degrades $n_{\rm eff}$.

We generate 1,000 simulated images similar to those described in
Section 4.4. In this set of images, we include a realistic distribution
of stars based on Ivezic et al. (2008). We then interpolate the shape
parameters of the stellar images to the galaxy positions and perform the
same PSF correction and shear measurement analysis as before, using
these interpolated PSF models. The measurement noise surface as a
function of $\nu$ and $R$ looks very similar to the corresponding
ideal-PSF version, but with slightly higher noise levels. We fix $c$ in
Equation 13 and get $a = 2.28$, $b = 1.1$ when fitted to the
measurements. This increase in measurement error yields a very slight
decrease in $n_{\rm eff}$, as listed in Table B1. This implies that the
actual PSF interpolation method used does not seriously affect
$n_{\rm eff}$.
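
The size of the degradation can be read off Table B1 directly. The short sketch below computes the fractional loss for each selection cut; the dictionary and function names are ours, and because the tabulated values are rounded to integers the resulting percentages are only approximate.

```python
# n_eff values (arcmin^-2) from Table B1, keyed by the selection cut k.
N_EFF_TRUE_PSF = {2.0: 48, 1.0: 37, 0.5: 24}
N_EFF_INTERPOLATED_PSF = {2.0: 46, 1.0: 36, 0.5: 23}


def fractional_loss(k):
    """Fractional decrease in n_eff when using the interpolated PSF model."""
    return 1.0 - N_EFF_INTERPOLATED_PSF[k] / N_EFF_TRUE_PSF[k]


for k in sorted(N_EFF_TRUE_PSF, reverse=True):
    print(f"k = {k}: n_eff lower by {100.0 * fractional_loss(k):.1f}%")
```

All three cuts give a loss of a few percent, in line with the 3-4% quoted in the table caption.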



